<!-- Copyright 2017 Capital One Services, LLC and Bitwise, Inc.
 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
 You may obtain a copy of the License at
 http://www.apache.org/licenses/LICENSE-2.0
 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License. -->
 
<!doctype html>
<html>
<head>
	<title>GroupCombine Properties</title>
	<link rel="stylesheet" type="text/css" href="../../css/style.css">
</head>
<body>

<p><span class="header-1">GroupCombine Properties</span></p>

<p><span><b>Properties</b>&nbsp;for the GroupCombine component can be viewed by double clicking the component on canvas. The properties window can also be opened by right-clicking the component icon on the job canvas and clicking on the 'Properties' option.</span></p>
<p><span>The properties contain a &#39;General&#39; tab and a &#39;Schema&#39; tab. Common properties are present in the General tab. Schema tab displays the option to accept the field schema i.e. field name, data type, scale etc. </span></p>

<p><a name="general_properties"></a><span class="header-2">General Properties</span></p>

<p><img alt="" src="../../images/GroupCombine_Properties_General.png" /></p>

<p><span class="header-2">Display</span></p>

<ul>
	<li><span><b>Name</b> - The identifier for the component. This is a <b>mandatory</b> property. This property is pre-populated with the component name, i.e. 'GroupCombine' followed by an incremental number. It can be changed to any custom name. The name property has following restrictions:</span></li>
	<ul>
		<li><span>Must be specified and should not be blank.</span></li>
		<li><span>Must be unique across the job.</span></li>
		<li><span>Accepts only alphabets (a-z), numerals (0-9) and 4 special characters: "_", "-", ",", " " (space)<./span></li>
	</ul>
	<li><span><b>ID</b> - ID field will specify unique id for every component. </span></li>
	<li><span><b>Type</b> - Type defines the type of component within the category. This typically is the name of the component. This is a non editable field.</span></li>
</ul>

<p><span class="header-2">Configuration</span></p>

<ul>
	<li><a name="key_fields"></a><span><b>Key Fields</b> - Key Fields opens up a grid that accepts the key fields for the combination operations. The records are grouped based on these key fields. Key fields also supports defining multiple fields as a key. The order specified in the key fields grid is maintained while performing the combination operation. The records are always sorted in ascending order on keys.</span>
	<p><span>Key fields is a <b>mandatory</b> property. A  minimum of one field should be specified as a the key field. </span></p></li>
</ul>
	<p><img alt="" src="../../images/GroupCombine_Key_Field.png" /></p>

<ul>	
<li><span><b>GroupCombine</b> - Edit button opens up the '<a href="GroupCombine_Operation_Editor.html">Operation Editor</a>' view, where the user can define the custom combination operation. The user can select the input fields needed for the combination operation, write custom operation classes (or select existing standard operation classes), extract the output fields all in the same view.
	<p><span>This is a <b>mandatory</b> property.</span></p>

	<ul>
		<li><span><b>Expression Editor Or Operation Class</b> -
				Edit option of GroupCombine property window allows user to choose only one
				option,i.e either Expression or Operation.</span>

			<p>
				<span><b>Operation Class</b> : Selecting Operation opens up a
					grid for the user to select for an existing operation class or
					create a new one. The operation class name can be parameterized
					which will be resolved at runtime. Click on <a
					href="GroupCombine_Operation_Class_Window.html">Operation Class</a></span>&nbsp;<span>
					to know more about creating and using Standard classes here.</span>
			</p>
			<p><span>When configuring with Operation class, Below methods are available in custom class:</span></p>
			<p><span>1.	<b>Initialize</b>(Reusable bufferRow) -> This method is used to initialize the input fields.
			(ex.: set the bufferRow as like bufferRow.setField(0, 0L);)</span></p>

			<p><span>2.	<b>update</b>(ReusableRow bufferRow, ReusableRow inputRow)->This method is used to perform the partial aggregation on the respective nodes in distributed environment. 
			Here bufferRow is like  “accumulator”  variable, and inputRow is input fields provided by user. This variable should be used in processing of each row of group. 
			(ex. : bufferRow.setField(0, ((Long) bufferRow.getField(0)) + inputRow.getLong(0) );)</span></p>

			<p><span>3.	<b>merge</b>(ReusableRow bufferRow1, ReusableRow bufferRow2) -> This method Merges two aggregation buffers and stores the updated buffer values back to `buffer1`.
            (ex.: bufferRow1.setField(0, ((Long) bufferRow1.getField(0)) + ((Long) bufferRow2.getField(0)));)</span></p>
			4. <b>evaluate</b>(ReusableRow bufferRow, ReusableRow outRow-> Evaluate method is the finalized operation which will be used to assign the aggregated values to Output variables/fields. This method calculates the final result of this UserDefinedAggregateFunction based on the given aggregation buffer.
			(ex. : outRow.setField(0, bufferRow.getField(0));)
		</ul>
				<p><img alt="" src="../../images/Aggregate_Mapping_View.png" width="100%"/></p>
		<ul>

			</p>
			<p>
				<span><b>Expression Editor</b> : Selecting Expression opens
					expression editor window. Click on <a
					href="Expression_Editor_window_for_groupCombine.html">Expression Editor</a> to know
					more about Expression Editor .</span>
			</p>
			<p><span>When configuring with Expression Editor:</span></p>
		<p><span>1.	Expression editor will contain by default 3 variables defined i.e. “accumulator”  , “accumulator1” and “accumulator2”  .</span></p>
		<p><span>2.	As in existing Aggregate component, only “accumulator”  variable needs to be initialized by user 
		Like (Initialize(Reusable bufferRow) method in operation class)</span></p>
		<p><span>3.	Other 2 accumulators  will be internally used in expression to merge two partially aggregated data.
		Like(merge(ReusableRow bufferRow1, ReusableRow bufferRow2) method in operation class)</span></p>

		<p><span>4.	On UI, 2 Expression Editors need to be configured for every output field.
		-	1st Expression will be similar to existing Aggregation with Expression.
		-	2nd expression editor will be for “Merge Expression”, user needs to use same expression as above but with “accumulator1” and “accumulator2”.
		Example: In 1st expression ,user can write expression as : 
		Accumulator = 0
		Expression = accumulator + salary (input Fields)</span></p>
			
			<p>
				<img src="../../images/Expression_Editor_Expression1_Window.png" />
			</p></li>
	</ul>
	
	<p>
				<span><b>Merge Expression</b> : In 2nd Merge Expression,
		Merge = accumulator1 + accumulator2.</span>
			</p>
			<p>
				<img src="../../images/Expression_Editor_Merge_Expression_Window.png" />
			</p>
	</li>
	<li><a name="runtime_properties"></a><span><b>Runtime Properties</b> - Runtime properties are used to override the Hadoop configurations specific to this GroupCombine component at run time. User is required to enter the Property Name and Property Value in the runtime properties grid.
	<p><span>Check <a href="../../How To Steps/How_To_Pass_Hadoop_Properties_To_Component.html"> How to pass Hadoop properties to component</a></span></p></li>
</ul>
	<p><img alt="" src="../../images/Runtime_Properties_Grid.png" /></p>
<ul>
	<li><span><b>Batch</b> - Batch accepts an integer value and signifies the batch this component will execute in. The default value for batch is 0. Batch can have a maximum value of 99. Batch is a <b>mandatory</b> property.</span></li>
</ul>

<p><a name="schema"></a><span class="header-2">Schema Tab</span></p>

<p><img alt="" src="../../images/GroupCombine_Properties_Schema.png" /></p>

<p><span>Schema is <b>mandatory</b> for GroupCombine component. Schema tab defines the record format on the out port of the GroupCombine component. A field in schema has multiple attributes as described below.</span></p>
<ul>
	<li><span><b>Field Name</b> - The name for the field. This is a mandatory attribute.</span></li>
	<li><span><b>Data type</b> - The data type for the field. This is a mandatory attribute. The default data type is "String". Check supported data types page for list of supported data types.</span></li>
	<li><span><b>Scale</b> - The number of digits to the right of decimal point. Scale is defined for Double, Float or BigDecimal field.</span></li>
	<li><span><b>Scale Type</b> – Scale Type accepts values as implicit or explicit for BigDecimal field and none for other data types. Explicit considers the length of '.' in precision and implicit ignores length of '.' precision for the BigDecimal field.</span></li>
	<li><span><b>Date Format</b> - The format for date data type. Refer to <a href="../../references/Date_formats.html">Date formats</a> page for acceptable date formats.</span></li>
	<li><span><b>Precision</b> – The number of significant digits (all digits except leading zeros and trailing zeros after decimal point).</span></li>
	<li><span><b>Field Description</b> – The description for the field.</span></li>
</ul>
<p>
<ul>
	<li><span><b>Pull Schema</b> – The schema defined in operation editor's output, will be pulled to the schema tab. The current schema in the grid will be overwritten with the schema from operation editor's output.</span></li>
	<li><span><b>Load Schema</b> – The load schema button allows user to load the schema in Schema Tab from an external schema file selected by the user. </span></li>
	<li><span><b>Export Schema</b> – The export schema button allows user to export the current schema in the file provided by the user.</span></li>
</ul>	
</p>

<p><a name="validations"></a><span class="header-2">Validations</span></p>
<p><span>The GroupCombine components applies validations to the mandatory fields as described above. Upon placing the GroupCombine component on job canvas for the first time (from component palette), the component shows up a warning icon as mandatory properties are not provided.</span></p>
<img src="../../images/GroupCombine_Validation_Warning.PNG" alt="Warning icon displayed on component" />

<p><span>The properties window also displays error icon on mandatory fields if it has an incorrect value. The error icon is displayed on the tab as well, if any of the field within the tab has some error.</span></p>
<img src="../../images/Group_combine_Properties_Error.png" alt="Error icon displayed on tabs" />

<p><span>If the properties window has some error even after user visit's it once, then the warning icon on the GroupCombine component on the job canvas changes to error icon. This error icon is removed only when all the mandatory fields are supplied with correct values.</span></p>
<img src="../../images/GroupCombine_Validation_Error.PNG" alt="Error icon displayed on component" />

</body>
</html>